Custom Open vSwitch Actions
A short tutorial on extending Open vSwitch.
For a project I’ve been working on (reimplementing some past work), being able to easily enable pushback at various points in the network is fairly important. For this, I wanted to try something I’d both learn from and be able to use as a springboard for later development and testing – mininet (and by extension, Open vSwitch) seemed like a good fit. Part of this requires a new feature to be hacked on top of OpenFlow: probabilistic packet dropping. I’ve written this short walkthrough/tutorial on the process for the benefit of anyone looking to make their own modifications. A full repository is included here.
[EDIT 2019-03-27]: Updated for OVS revision 8e73833.
Mininet relies on virtual ethernet links and network namespaces to allow simulated communication between switches and nodes: switches use Open vSwitch (OVS) and are configured with the OpenFlow protocol. As-is, the OpenFlow protocol offers meters which are capable of rate-limiting traffic, but Open vSwitch doesn’t support these in its kernel datapath. To do what I need, I have to modify OVS (which is a little less than ideal)!
Thankfully, other kind folks have attempted this and (sort of) documented their process. While this mailing list entry was immensely helpful, it doesn’t really explain what the necessary changes mean, or give concrete examples for some of the more ambiguous areas. The official advice is less helpful still: compile with --enable-Werror and see what breaks. Hopefully, this short tutorial will be a helpful supplement!
Preliminaries
On your development machine, follow the official guide on installing OVS from source. Any dependencies and build tools needed are all listed, and I strongly recommend configuring with --enable-Werror.
If you’re running mininet from the official VM, this short guide will walk you through uninstalling the default version of OVS and ensuring that your new kernel modules will be preferred.
OpenFlow Protocol and Action Codes
First things first, you’ll need to introduce your own action code as an extension to OpenFlow.
lib/ofp-actions.c:
enum ofp_raw_action_type {
/* ... */
/* NX1.3+(47): struct nx_action_decap, ... */
NXAST_RAW_DECAP,
/* OF1.0+(29): uint32_t. */
OFPAT_RAW_PROBDROP,
/* ... */
}
Here I’ve added in a new action as seen on the wire: OFPAT_RAW_PROBDROP.
The comment you prepend your new entry with is very important – some headers
are autogenerated according to the protocol version, your chosen code, and the
argument type your action needs. The file explains how to do this in detail.
For me at least, choosing an integral type was
a necessary consideration: we will be touching kernel code, and are forced to
avoid floats.
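As a rough illustration of the integral encoding, here is a minimal sketch of mapping a probability in [0, 1] onto the uint32_t range the action carries. The helper name prob_to_u32 is my own invention for this example and is not part of OVS; a controller or script would do this conversion before building the action.

```c
#include <stdint.h>

/* Hypothetical helper (not part of OVS): map a probability in [0, 1]
 * onto the full uint32_t range, so that only integers ever reach the
 * kernel. Out-of-range and NaN inputs are clamped. */
static uint32_t
prob_to_u32(double p)
{
    if (!(p > 0.0)) {
        /* Covers zero, negative values and NaN. */
        return 0;
    }
    if (p >= 1.0) {
        return UINT32_MAX;
    }
    /* Scale in double precision; the product is strictly below 2^32,
     * so the truncating cast is well defined. */
    return (uint32_t) (p * 4294967295.0);
}
```

With this mapping, 0.5 becomes 2147483647 and 1.0 becomes 4294967295, which is the "let everything through" value under the drop rule used later on.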
include/openvswitch/ofp-actions.h:
#define OFPACTS \
    /* ... */ \
    OFPACT(GOTO_TABLE, ofpact_goto_table, ofpact, "goto_table") \
    OFPACT(PROBDROP, ofpact_probdrop, ofpact, "probdrop")
/* ..., after "struct ofpact_decap { ... }" */
/* OFPACT_PROBDROP.
*
* Used for OFPAT_PROBDROP */
struct ofpact_probdrop {
OFPACT_PADDED_MEMBERS(
struct ofpact ofpact;
uint32_t prob; /* Uint probability, "covers" 0->1 range. */
);
uint8_t data[];
};
Firstly, through macro magic you’re defining OFPACT_PROBDROP to be seen elsewhere as a new OFP action (with a type which is 1 greater than GOTO_TABLE). The OFPACT statement also states that your new action will have arguments of the form struct ofpact_probdrop, will not be variable-length, and can be added at the command line as probdrop:<something>.
(If you need variable length arguments, the comment at the head of that file is helpful.)
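If the "macro magic" is unfamiliar, the underlying trick is usually called an X-macro: one list of entries is expanded several times with different definitions of the per-entry macro. The following is a simplified toy version, not the OVS definitions; every TOY_/toy_ name is made up for illustration, and the real OFPACT entries take an extra argument.

```c
#include <string.h>

/* Toy action list in the style of OFPACTS: one entry per action. */
#define TOY_ACTS                                     \
    TOY_ACT(OUTPUT,     toy_output,   "output")      \
    TOY_ACT(GOTO_TABLE, toy_goto,     "goto_table")  \
    TOY_ACT(PROBDROP,   toy_probdrop, "probdrop")

/* First expansion: an enum with one TOY_<ENUM> constant per entry,
 * numbered in list order. */
enum toy_act_type {
#define TOY_ACT(ENUM, STRUCT, NAME) TOY_##ENUM,
    TOY_ACTS
#undef TOY_ACT
    TOY_N_ACTS
};

/* Second expansion: map each constant to its command-line name. */
static const char *
toy_act_name(enum toy_act_type type)
{
    switch (type) {
#define TOY_ACT(ENUM, STRUCT, NAME) case TOY_##ENUM: return NAME;
        TOY_ACTS
#undef TOY_ACT
    case TOY_N_ACTS:
        break;
    }
    return NULL;
}

/* Reverse lookup, as a parser would do for "probdrop:<something>".
 * Returns -1 when the name is unknown. */
static int
toy_act_from_name(const char *name)
{
    for (int i = 0; i < TOY_N_ACTS; i++) {
        if (!strcmp(toy_act_name((enum toy_act_type) i), name)) {
            return i;
        }
    }
    return -1;
}
```

Appending an entry to the list is enough to get a new enum value (one greater than its predecessor) and a name mapping, which is why a single OFPACT line has such wide-reaching effects.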
We now need to define codes for that action as it is seen internally by ovs-vswitchd, the kernel module, and the various configuration utilities.
datapath/linux/compat/include/linux/openvswitch.h:
enum ovs_action_attr {
    /* ... */
    /*
     * after #ifndef __KERNEL__ ... #endif.
     * the equals is thus ABSOLUTELY NECESSARY
     */
    OVS_ACTION_ATTR_PROBDROP = 23, /* uint32_t, prob in [0, 2^32 - 1] */
    __OVS_ACTION_ATTR_MAX,         /* Nothing past this will be accepted
                                    * from userspace. */
    /* ... */
};
While this is fairly straightforward, placing the new action here is important for maintaining compatibility. There’s a subtle “gotcha” to be aware of: the entries guarded by #ifndef __KERNEL__ only exist in the userspace build, so if we don’t specify an explicit value for this enum entry then the kernel module and the userland ovs-vswitchd will use different codes for the new action. This was, for me at least, an awful debugging experience that I wouldn’t wish on anyone else.
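To make the numbering problem concrete, here is a small sketch with two enums standing in for how the same header looks to a userspace build and to a kernel build. The names and values are illustrative only and are not taken from openvswitch.h.

```c
/* Userspace view: the userspace-only entry is compiled in, which shifts
 * every implicitly numbered entry that follows it. */
enum user_view {
    USER_ATTR_A = 1,
    USER_ATTR_B,               /* 2 */
    USER_ATTR_USERSPACE_ONLY,  /* 3: hidden from the kernel build */
    USER_ATTR_IMPLICIT,        /* 4 in this view */
    USER_ATTR_EXPLICIT = 23    /* pinned, same in both views */
};

/* Kernel view: the guarded entry is absent. */
enum kernel_view {
    KERN_ATTR_A = 1,
    KERN_ATTR_B,               /* 2 */
    KERN_ATTR_IMPLICIT,        /* 3 in this view: mismatch! */
    KERN_ATTR_EXPLICIT = 23    /* pinned, same in both views */
};
```

An action sent as code 4 by the daemon would be read as something else entirely by the module, whereas the explicitly numbered entry agrees on both sides.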
Action Behaviour
Most likely, you’ll be using OVS with the kernel module. As such, it’s your first priority to implement your action there.
datapath/actions.c:
/* Ask for a random number.
 * "prob" is the amount we should let through: true means drop,
 * false means let the packet pass on. */
static bool prob_drop(uint32_t prob)
{
    /* since we can't use rand() in the kernel */
    return prandom_u32() > prob;
}

static int do_execute_actions(/* ... */)
{
    /* ... */
    switch (nla_type(a)) {
    /* ... */
    case OVS_ACTION_ATTR_PROBDROP:
        /* No need to free, taken care of for us.
         * This function just reads the attribute to
         * know if we should drop. */
        if (prob_drop(nla_get_u32(a))) {
            /* Skip every remaining action so none of them runs. */
            while (rem) {
                a = nla_next(a, &rem);
            }
        }
        break;
    }
    /* ... */
}
Adding this new case here is the only thing that needs to be done. Note the nla_get_whatever(a) functions in use here; these are used to read any arguments passed alongside the action from the netlink socket connecting this module with ovs-vswitchd. Compare against the existing implementation to get a feel for what functions you should be using to read your argument.
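For intuition about what those accessors walk over, here is a self-contained userspace sketch of the attribute layout: a 4-byte header (length, then type) followed by a payload padded to 4 bytes. The toy_ functions are my own stand-ins that loosely mirror nl_msg_put_u32, nla_get_u32 and nla_next; they are not the kernel implementations.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in for the kernel's struct nlattr header. */
struct toy_nlattr {
    uint16_t nla_len;   /* Header plus payload, excluding padding. */
    uint16_t nla_type;  /* Action code, e.g. 23 for probdrop. */
};

#define TOY_HDRLEN (sizeof(struct toy_nlattr))
#define TOY_ALIGN(len) (((len) + 3u) & ~3u)

/* Append one u32 attribute at buf + *used (host byte order). */
static void
toy_put_u32(unsigned char *buf, size_t *used, uint16_t type, uint32_t value)
{
    struct toy_nlattr hdr = { (uint16_t) (TOY_HDRLEN + sizeof value), type };

    memcpy(buf + *used, &hdr, sizeof hdr);
    memcpy(buf + *used + TOY_HDRLEN, &value, sizeof value);
    *used += TOY_ALIGN(hdr.nla_len);
}

/* Type of the attribute starting at offset off. */
static uint16_t
toy_type(const unsigned char *buf, size_t off)
{
    struct toy_nlattr hdr;

    memcpy(&hdr, buf + off, sizeof hdr);
    return hdr.nla_type;
}

/* Payload of the attribute at offset off, read as a u32. */
static uint32_t
toy_get_u32(const unsigned char *buf, size_t off)
{
    uint32_t value;

    memcpy(&value, buf + off + TOY_HDRLEN, sizeof value);
    return value;
}

/* Offset of the attribute following the one at off. */
static size_t
toy_next(const unsigned char *buf, size_t off)
{
    struct toy_nlattr hdr;

    memcpy(&hdr, buf + off, sizeof hdr);
    return off + TOY_ALIGN(hdr.nla_len);
}

/* Simplified mirror of the drop branch: walk past every remaining
 * attribute so that no later action would run. Returns how many
 * attributes were skipped. */
static unsigned int
toy_skip_rest(const unsigned char *buf, size_t off, size_t used)
{
    unsigned int skipped = 0;

    while (off < used) {
        off = toy_next(buf, off);
        skipped++;
    }
    return skipped;
}
```

An action list of probdrop followed by an output therefore occupies 16 bytes, and "dropping" amounts to advancing past both attributes without acting on the second.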
lib/odp-execute.c:
static bool
requires_datapath_assistance(const struct nlattr *a)
{
enum ovs_action_attr type = nl_attr_type(a);
switch (type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
return false;
/* ... */
}
}
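Whether to return true here is fairly context sensitive: it signals that the action can only be carried out with help from the datapath itself, which most actions (this one included) shouldn’t need.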
Userland Actions
If you’d like to implement your action in the userspace datapath for whatever reason, you should add/modify these files. At the very least, you must add in an OVS_NOT_REACHED() case.
lib/packets.h:
/* ... */
bool prob_drop(uint32_t prob);
#endif /* packets.h */
lib/packets.c:
/* Ask for a random number.
 * "prob" is the amount we should let through: true means drop,
 * false means let the packet pass on. */
bool
prob_drop(uint32_t prob)
{
    /* Fixed-width type, so the roll covers exactly the same range as prob. */
    uint32_t roll_i;

    random_bytes(&roll_i, sizeof(roll_i));
    return roll_i > prob;
}
lib/dpif-netdev.c:
static void
dp_execute_cb( /* ... */ )
{
/* ... */
switch ((enum ovs_action_attr)type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
OVS_NOT_REACHED();
}
}
lib/dpif.c:
static void
dpif_execute_helper_cb( /* ... */ )
{
/* ... */
switch ((enum ovs_action_attr)type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
OVS_NOT_REACHED();
}
}
ofproto/ofproto-dpif-sflow.c and ofproto/ofproto-dpif-ipfix.c:
void
dpif_sflow_read_actions( /* ... */ )
{
switch (type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
/* Ignore sFlow for now, unless needed. */
break;
}
}
/* ... */
void
dpif_ipfix_read_actions( /* ... */ )
{
/* ... */
switch (type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
/* Again, ignore for now. Not needed. */
break;
}
}
lib/odp-execute.c:
void
odp_execute_actions( /* ... */ )
{
/* ... */
switch ((enum ovs_action_attr)type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP: {
size_t i;
const size_t num = dp_packet_batch_size(batch);
DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
if (!prob_drop(nl_attr_get_u32(a))) {
dp_packet_batch_refill(batch, packet, i);
} else {
dp_packet_delete(packet);
}
}
break;
}
}
}
I mostly avoided these because they weren’t necessary, but these are useful starting points.
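If you want to sanity-check the drop rule in isolation before touching either datapath, a small simulation is enough. This is only a sketch: xorshift32 is a deterministic generator I have swapped in for prandom_u32()/random_bytes(), and all toy_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* xorshift32: a tiny deterministic generator standing in for the real
 * random sources. The state must be nonzero, and stays nonzero. */
static uint32_t
toy_rand_u32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Same comparison as prob_drop(): true means drop. */
static bool
toy_prob_drop(uint32_t roll, uint32_t prob)
{
    return roll > prob;
}

/* Fraction of n simulated packets that pass for a given prob.
 * n is assumed to be greater than zero. */
static double
toy_pass_rate(uint32_t prob, unsigned int n, uint32_t seed)
{
    unsigned int passed = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (!toy_prob_drop(toy_rand_u32(&seed), prob)) {
            passed++;
        }
    }
    return (double) passed / n;
}
```

With prob set to the maximum value every packet passes, and with a mid-range value roughly half do, which matches the "amount we let through" reading of the argument.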
Action Processing
To help ovs-vswitchd handle the new action, we need to modify a fair number of switch statements so that it knows what characteristics the action has, how to properly iterate over it, and how to validate any arguments included. Part of this includes classifying it so that OVS knows how our action interacts with the action set used by WRITE_ACTIONS. Thankfully, the function names are mostly self-explanatory here!
lib/ofp-actions.c:
struct ofpact *
ofpact_next_flattened(const struct ofpact *ofpact)
{
switch (ofpact->type) {
/* ... */
case OFPACT_PROBDROP:
return ofpact_next(ofpact);
}
/* ... */
}
/* ... */
enum ovs_instruction_type
ovs_instruction_type_from_ofpact_type(enum ofpact_type type)
{
switch (type) {
/* ... */
case OFPACT_PROBDROP:
default:
return OVSINST_OFPIT11_APPLY_ACTIONS;
/* ... */
}
}
/* ... */
static bool
ofpact_outputs_to_port(const struct ofpact *ofpact, ofp_port_t port)
{
switch (ofpact->type) {
/* ... */
case OFPACT_PROBDROP:
default:
return false;
}
}
For how your action does (or does not) execute when placed in an action set, pick one of the following:
lib/ofp-actions.c:
/* Choose whether to place the last instance of an action into
 * the action set, allow duplicates or disallow it. */

/* ONCE */
#define ACTION_SET_ORDER \
    SLOT(OFPACT_PROBDROP) \
    SLOT(OFPACT_STRIP_VLAN) \
    /* ... */

/* OUTPUT-LIKE */
#define ACTION_SET_FINAL_PRIORITY \
    SLOT(OFPACT_PROBDROP) \
    SLOT(OFPACT_CT) \
    /* ... */

static enum action_set_class
action_set_classify(const struct ofpact *a)
{
    switch (a->type) {
    /* ... */

    /* MULTIPLE */
    /* ... */
    case OFPACT_PROBDROP:
        return ACTION_SLOT_SET_OR_MOVE;

    /* NEVER */
    /* ... */
    case OFPACT_PROBDROP:
        return ACTION_SLOT_INVALID;
    /* ... */
    }
}
Command Parsing/Formatting/Conversion
Now, within the same file (lib/ofp-actions.c) we must write some new methods to help parse and print our action when we query or alter the switch with ovs-ofctl. The function names are derived from one of the autogenerated headers mentioned earlier, but have this format:
lib/ofp-actions.c:
/* ..., immediately after format_GOTO_TABLE */
/* Okay, the new stuff! */
/* Encoding the action packet to put on the wire. */
static void
encode_PROBDROP(const struct ofpact_probdrop *prob,
enum ofp_version ofp_version OVS_UNUSED,
struct ofpbuf *out)
{
uint32_t p = prob->prob;
put_OFPAT_PROBDROP(out, p);
}
/* Reversing the process. */
static enum ofperr
decode_OFPAT_RAW_PROBDROP(uint32_t prob,
enum ofp_version ofp_version OVS_UNUSED,
struct ofpbuf *out)
{
struct ofpact_probdrop *op;
op = ofpact_put_PROBDROP(out);
op->prob = prob;
return 0;
}
/* Helper for below. */
static char * OVS_WARN_UNUSED_RESULT
parse_prob(char *arg, struct ofpbuf *ofpacts)
{
struct ofpact_probdrop *probdrop;
uint32_t prob;
char *error;
error = str_to_u32(arg, &prob);
if (error) return error;
probdrop = ofpact_put_PROBDROP(ofpacts);
probdrop->prob = prob;
return NULL;
}
/* Go from string-formatted args into an action struct.
e.g. ovs-ofctl add-flow ... actions=probdrop:3000000000,output:"s2-eth0"
*/
static char * OVS_WARN_UNUSED_RESULT
parse_PROBDROP(char *arg, const struct ofpact_parse_params *pp)
{
return parse_prob(arg, pp->ofpacts);
}
/* Used when printing info to console. */
static void
format_PROBDROP(const struct ofpact_probdrop *a,
const struct ofpact_format_params *fp)
{
/* Feel free to use e.g. colors.param,
colors.end around parameter names */
ds_put_format(fp->s, "probdrop:%"PRIu32, a->prob);
}
/* ... */
static enum ofperr
check_PROBDROP(const struct ofpact_probdrop *a OVS_UNUSED,
const struct ofpact_check_params *cp OVS_UNUSED)
{
/* My method needs no checking. Probably. */
return 0;
}
The ofpact_put_PROBDROP and put_OFPAT_PROBDROP functions are, again, autogenerated.
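The parse and format functions should be inverses of one another: whatever ovs-ofctl prints for a flow ought to be accepted again as input. Here is a standalone sketch of that round trip using only the C standard library. toy_parse_prob is a stricter stand-in for OVS’s str_to_u32 (decimal only, which is my assumption for the example), and toy_format_prob mirrors the format string used above.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal string into a uint32_t.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int
toy_parse_prob(const char *arg, uint32_t *prob)
{
    char *end;
    unsigned long long v;

    /* Reject empty strings, signs and leading whitespace up front,
     * since strtoull() would otherwise accept some of them. */
    if (*arg < '0' || *arg > '9') {
        return -1;
    }
    errno = 0;
    v = strtoull(arg, &end, 10);
    if (errno || *end != '\0' || v > UINT32_MAX) {
        return -1;
    }
    *prob = (uint32_t) v;
    return 0;
}

/* Write "probdrop:<value>" into out; returns the formatted length. */
static int
toy_format_prob(char *out, size_t size, uint32_t prob)
{
    return snprintf(out, size, "probdrop:%" PRIu32, prob);
}
```

Rejecting out-of-range values at parse time matters here, because a silently truncated probability would still be a perfectly valid argument as far as the datapath is concerned.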
If you want more complex handling or multiple parameters, you may consider:
/* Simple key-value walker. Useful for complex actions. */
char *key;
char *value;
while (ofputil_parse_key_value(&arg, &key, &value)) {
if (!strcmp(key, "P_send")) {
error = str_to_u32(value, &prob);
} else {
return xasprintf("invalid key '%s' in probdrop argument",
key);
}
if (error) return error;
}
Finally, some more clerical printing duties! While these are for pretty-printing etc., I’m not sure where their output applies.
lib/odp-util.c:
static void
format_odp_action( /* ... */ )
{
/* ... */
switch (type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
ds_put_format(ds, "pdrop(%"PRIu32")", nl_attr_get_u32(a));
break;
/* ... */
}
}
static int
odp_action_len(uint16_t type)
{
/* ... */
switch ((enum ovs_action_attr) type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP: return sizeof(uint32_t);
}
}
Action Translation – Kernel-User Interface
The final areas that need to be fixed up handle the communication between the kernel datapath and the user-level daemon: part of this process is action translation (xlate) between the protocol and the physical action.
For some context, the daemon and the kernel module communicate with one another via a Netlink socket. The daemon sends flow actions down to the kernel on arrival (for packet handling), and polls for any upcalls from the kernel as they arrive. Typically, these occur when a packet arrives which does not match any known entries (i.e., the packet must then be sent to the controller, or a wildcard rule needs to be concretely instantiated). Using Netlink sockets is fairly simple: if you’re unfamiliar, they never leave the host, so we don’t need to think about byte order.
ofproto/ofproto-dpif-xlate.c:
/* Put this with the other "compose" functions. */
static void
compose_probdrop_action(struct xlate_ctx *ctx, struct ofpact_probdrop *op)
{
uint32_t prob = op->prob;
nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_PROBDROP, prob);
}
/* ... */
static void
do_xlate_actions( /* ... */ )
{
switch (a->type) {
/* ... */
case OFPACT_PROBDROP:
compose_probdrop_action(ctx, ofpact_get_PROBDROP(a));
break;
}
}
/* ... */
static bool
xlate_fixup_actions()
{
/* ... */
switch ((enum ovs_action_attr) type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
ofpbuf_put(b, a, nl_attr_len_pad(a, left));
/* ... */
}
/* ... */
}
/* ... */
/* No action can undo the packet drop: reflect this. */
static bool
reversible_actions(const struct ofpact *ofpacts, size_t ofpacts_len)
{
const struct ofpact *a;
OFPACT_FOR_EACH (a, ofpacts, ofpacts_len) {
switch (a->type) {
/*... */
case OFPACT_PROBDROP:
return false;
}
}
return true;
}
/* ... */
/* PROBDROP likely doesn't require explicit thawing. */
static void
freeze_unroll_actions( /* ... */ )
{
/* ... */
switch (a->type) {
case OFPACT_PROBDROP:
/* These may not generate PACKET INs. */
break;
}
}
/* ... */
/* Naturally, don't need to recirculate since we don't change packets. */
static void
recirc_for_mpls(const struct ofpact *a, struct xlate_ctx *ctx)
{
/* ... */
switch (a->type) {
case OFPACT_PROBDROP:
default:
break;
}
}
The most important parts of translation are do_xlate_actions and compose_probdrop_action, which iterate over an action set and transmit one probdrop action respectively. nl_msg_put_u32 writes both the action type (OVS_ACTION_ATTR_PROBDROP) and the argument value along the socket to be read by the datapath module. What you do for the rest of the functions here really depends on the semantics of your action, so I suggest you compare against what other actions require.
datapath/flow_netlink.c:
static int __ovs_nla_copy_actions( /*...*/ )
{
/* ... */
static const u32 action_lens[OVS_ACTION_ATTR_MAX + 1] = {
/* ... */
[OVS_ACTION_ATTR_PROBDROP] = sizeof(u32),
};
/* ... */
/* Be careful here, your compiler may not catch this one
* even with -Werror */
switch (type) {
/* ... */
case OVS_ACTION_ATTR_PROBDROP:
/* Finalest sanity checks in the kernel. */
break;
/* ... */
}
/* ... */
}
Back to the kernel at last! The final changes we need to make here are simple enough; they’re essentially last-stop sanity checks on the argument length and value. Not much really needs to be said.
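The length check is table-driven: the action code indexes an array of expected payload sizes. Below is a simplified userspace sketch of that idea. The toy_ names and the OUTPUT entry are illustrative, and unlike the kernel table this version simply rejects any code that has no entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_attr {
    TOY_ATTR_UNSPEC,
    TOY_ATTR_OUTPUT,          /* u32 port number */
    TOY_ATTR_PROBDROP = 23,   /* u32 probability */
    TOY_ATTR_LIMIT
};
#define TOY_ATTR_MAX (TOY_ATTR_LIMIT - 1)

/* Expected payload size per action code. Designated initializers leave
 * every unlisted slot at zero. */
static const uint32_t toy_action_lens[TOY_ATTR_MAX + 1] = {
    [TOY_ATTR_OUTPUT] = sizeof(uint32_t),
    [TOY_ATTR_PROBDROP] = sizeof(uint32_t),
};

/* Accept an attribute only if its code is known and its payload has
 * exactly the expected size. */
static bool
toy_validate(unsigned int type, size_t payload_len)
{
    if (type == TOY_ATTR_UNSPEC || type > (unsigned int) TOY_ATTR_MAX) {
        return false;
    }
    if (toy_action_lens[type] == 0) {
        /* No table entry: treated as unknown in this sketch. */
        return false;
    }
    return toy_action_lens[type] == payload_len;
}
```

Forgetting the table entry is the kind of mistake the compiler has no way to flag, because a missing designated initializer is perfectly legal C.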
You should be more or less good to go now: just compile, fix and iterate!
Testing
Run mininet, add rules and test as you need:
sh ovs-ofctl add-flow s1 \
in_port="s1-eth1",actions=probdrop:1000000000,"s2-eth2"
h1 ping h2
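For a rough idea of what to expect from the flow above: a packet passes when the random roll is at most prob, so probdrop:1000000000 should let through a little under a quarter of the packets matching that rule. A minimal sketch of the arithmetic, assuming rolls are uniform over the full 32-bit range (expected_pass_fraction is a hypothetical helper, not part of OVS):

```c
#include <stdint.h>

/* Expected fraction of matching packets that pass, given that a packet
 * is dropped when roll > prob and roll is uniform over [0, 2^32 - 1].
 * There are prob + 1 passing rolls out of 2^32 possible ones. */
static double
expected_pass_fraction(uint32_t prob)
{
    return ((double) prob + 1.0) / 4294967296.0;
}
```

For 1000000000 this works out to about 0.233. If the loss you observe is far from what this predicts for your rules, that is a hint the argument is being mangled somewhere between ovs-ofctl and the datapath.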
Alternatively, you can build control messages yourself (to manually send over sockets) using a library like twink for python:
import twink.ofp5 as ofp
import twink.ofp5.build as ofpb
import twink.ofp5.parse as ofpp
flow_pdrop_msg = ofpb.ofp_flow_mod(
None, 0, 0, 0, ofp.OFPFC_ADD,
0, 0, 1, None, None, None, 0, 1,
ofpb.ofp_match(None, None, None),
ofpb.ofp_instruction_actions(ofp.OFPIT_WRITE_ACTIONS, None, [
# 29 is the number I picked for Pdrop.
# 0xffffffff allows all packets through.
ofpb._pack("HHI", 29, 8, 0xffffffff),
ofpb.ofp_action_output(None, 16, 1, 65535)
])
)
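For comparison, here is what that 8-byte action looks like when built by hand in C. OpenFlow fields are big-endian on the wire; I am assuming the Python packing call above produces the same three fields in that order (16-bit type 29, 16-bit length 8, 32-bit probability). toy_encode_probdrop is a made-up name for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_OFPAT_PROBDROP 29   /* Action code chosen in this article. */

/* Serialize a probdrop action into out in network (big-endian) byte
 * order. Returns the number of bytes written. */
static size_t
toy_encode_probdrop(unsigned char out[8], uint32_t prob)
{
    const uint16_t type = TOY_OFPAT_PROBDROP;
    const uint16_t len = 8;   /* 4-byte header + 4-byte argument. */

    out[0] = (unsigned char) (type >> 8);
    out[1] = (unsigned char) (type & 0xff);
    out[2] = (unsigned char) (len >> 8);
    out[3] = (unsigned char) (len & 0xff);
    out[4] = (unsigned char) (prob >> 24);
    out[5] = (unsigned char) ((prob >> 16) & 0xff);
    out[6] = (unsigned char) ((prob >> 8) & 0xff);
    out[7] = (unsigned char) (prob & 0xff);
    return 8;
}
```

Writing the bytes with shifts rather than copying a struct keeps the result independent of host endianness and of any padding the compiler might add.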
Debugging
All the errors that are being thrown at you by --enable-Werror are first priority. It’s easy to miss something in a codebase this large.
For the love of god, befriend gdb. If you think your issues originate in the userland code, run ps -e | grep ovs-vswitchd to get the pid, hook in with sudo gdb ovs-vswitchd <pid> and have fun.
Kernel problems are a lot more nefarious, and not something you really want to tussle with for too long. If you’re not used to this (as I wasn’t), you have dmesg at your disposal. This will show you any diagnostic messages originating from the kernel: feel free to sprinkle your own printk or OVS_NLERR statements around. I probably wouldn’t advise trying to hook up gdb or anything like that. For instance, these tools were what helped me unearth the “gotcha” I mentioned at the start of this guide.
Conclusion
That wraps up my process for adding, testing and debugging a new action for Open vSwitch. Although fairly niche, it was a good experience for me and I hope this helps anyone else looking to toy around for research or out of their own interest!